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SUMMARY 


Classical  geostatistical  methods  arc  powerful  tools  to  study  the  spatial- temporal  structure  of  sta¬ 
tionary  processes.  Separability  is  also  a  common  assumption  to  avoid  many  of  the  problems  of  space 


and  time  modeling.  This  subclass  of  separable  spatial- temporal  processes  has  several  advantages, 
including  rapid  fitting  and  simple  extensions  of  many  techniques  developed  and  successfully  used 
in  time  series  analyses  and  geostatistics.  However,  in  real  applications  spatial-temporal  processes 
arc  rarely  stationary  and  separable;  then  an  important  extension  of  these  traditional  geostatistical 
methods  is  to  processes  that  have  a  nonstationary  and  nonseparable  covariance.  In  this  work,  some 
new  approaches  to  estimate  and  model  nonstationarity  and  nonseparability  are  presented.  The 
most  important  scientific  contributions  of  the  research  proposed  here  are;  the  estimation  of  the 
complex  spatial-temporal  dependence  of  environmental  processes  in  general  situations  (nonstation¬ 
arity,  anisotropy,  nonseparability),  and  the  introduction  of  flexible  models  for  spatial  prediction 
of  environmental  processes.  We  apply  the  statistical  methods  proposed  here  to  model  the  spatial- 
temporal  patterns  of  wind  fields,  and  for  wind  field  mapping,  which  combines  numerical  meteoro¬ 
logical  model  output  with  observational  data.  The  data  used  in  this  phase  of  the  research  came 
from  the  subregion  surrounding  and  including  the  Chesapeake  Bay  for  July  2002.  The  ability  to 
accurately  forecast  wind  speed  and  direction  in  coastal  locations  is  critical  for  support  of  shipping, 
marine  recreational  activities,  and  national  security  issues  related  to  contaminant  transport. 
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1  Introduction 


Increased  emphasis  on  homeland  security  issues  provides  an  additional  incentive  for  developing 
accurate  meteorological  analysis  and  forecast  fields  for  the  coastal  wind  regime.  The  transport  of 
chemical  or  biological  agents  is  critically  dependent  on  accurate  forecasts  of  meteorological  fields 
such  as  the  near-surface  wind  speed  and  direction.  These  forcasts  in  turn  depend  on  the  accuracy 
of  the  observed  data  fields  that  go  into  the  numerical  meteorological  model  which  produces  these 
forecasts.  The  emphasis  in  this  paper  will  be  on  the  development  of  a  procedure  that  could  poten¬ 
tially  be  used  to  improve  either  the  assimilation  of  surface  wind  observations  into  a  model  initial 
field,  or  to  improve  the  accuracy  of  post-processing  algorithms  run  on  meteorological  model  output. 
The  wind  field  in  coastal  regions  is  influenced  by  complex  physical  processes  associated  with  small- 
scale  variations  in  terrain,  and  strong  gradients  in  moisture,  temperature,  and  surface  roughness. 
The  fundamental  diurnal  changes  in  the  structure  of  the  atmospheric  boundary  layer  differ  greatly 
from  the  land  surface  to  the  water  surface.  In  part  this  is  due  to  the  fact  that  radiant  energy 
arriving  at  the  water  surface  is  distributed  over  a  large  volume  of  water,  while  over  land  this  energy 
is  concentrated  at  the  earth’s  surface.  In  addition,  over  the  water  surface  persistent  evaporation 
acts  to  lower  the  temperature  of  the  air  over  the  water  surface  in  comparison  to  that  over  the  land 
surface.  Differences  in  the  specific  heats  of  water  and  soil  also  play  a  role,  especially  during  the 
cooling  cycle  which  begins  in  the  evening  hours.  These  effects  lead  to  fundamental  differences  in 
heat  fluxes  between  the  two  surfaces.  As  a  result,  the  formation  of  a  thermal  internal  boundary 
layer  often  takes  place.  Another  result  of  these  processes  is  that  the  stability  regimes  over  water  can 
differ  greatly  from  those  over  land.  Even  the  larger  scale  processes  associated  with  these  differences 
between  the  land  boundary  layer  and  marine  boundary  layer  are  difficult  for  a  mesoscaie  model  to 
capture. 

Recent  work  by  Titlow  and  McQueen  (1999)  has  indicated  that  the  current  versions  of  the  available 
operational  meteorological  models  are  incapable  of  producing  reliable  high- resolution  wind  forecasts 
during  periods  of  weak  synoptic-scale  forcing.  Although  sea-breeze  processes  have  been  adequately 
simulated  in  research  studies  (e.g.,  Rao  and  Fuelberg  2000),  operational  prediction  of  small-scale 
processes  such  as  the  land-sea  breeze  circulation  remains  problematic  for  real-time  NWP  models 
available  from  NCER  This  is  true  even  at  high  resolution  (e.g.,  4  km)  grid  spacing.  Currently 
Titlow  uses  the  output  from  these  models  as  one  component  in  his  efforts  to  provide  detailed  wind 
field  forecasts  over  the  Chesapeake  Bay.  Any  procedure  that  would  make  these  meteorological 
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model  output  wind  fields  more  useful  to  him  and  others  like  him  would  be  of  great  benefit  to 
recreational  interests  (e.g.,  sailing,  fishing,  and  wind  surfing),  commercial  interests  (e.g.,  shipping 
and  aviation),  and  environmental  interests. 

In  the  summer  of  2001,  NASA  and  the  Defense  Threat  Reduction  Agency  (DTRA)  sponsored  a 
field  experiment  (The  Chesapeake  Bay  Numerical  Weather  Prediction  Model  Experiment)  over  the 
Chesapeake  Bay  (see  Figure  1)  to  collect  data  pertinent  to  the  numerical  modeling  community. 
Data  were  collected  over  the  period  July  16  to  July  31.  These  days  were  chosen  in  the  hopes  that 
weakly  forced  flow  conditions  would  prevail  for  a  good  portion  of  this  period.  Unfortunately,  only 
the  period  21-23  July  fully  met  that  criterion.  It  was  hoped  that  the  data  collected  could  be  used 
to  improve  the  ability  of  mesoscale  models  to  forecast  the  state  of  the  atmosphere  over  complex 
terrain  when  small  scale  processes  dominate  the  flow  regime,  either  by  improving  initial  conditions 
or  in  the  development  of  a  post-processing  technique  that  could  be  applied  to  meteorological  model 
output.  Limited  data  collection  was  continued  into  the  summer  of  2002.  The  criterion  of  weakly 
forced  conditions  was  better  met  in  2002  than  in  2001.  This  study  will  examine  data  from  the  2002 
period.  During  this  period  both  observed  data  and  MM5  (Penn  State  University/National  Center 
for  Atmospheric  Research  Mesoscale  Model  Version  5)  output  fields  were  available. 

The  research  presented  in  this  paper  represents  a  novel  approach  to  wind  field  modeling,  analysis 
and  spatial  prediction. 

The  difficult  challenge  of  modeling  the  spatial  structure  of  wind  fields  over  time  can  be  overcome 
by  using  separable  processes.  A  spatial-temporal  field  Z(s,^),  where  s  represents  space  and  t  time, 
is  separable  if  Cov{Z(s,  t),  Z{s\  ^^)}  =  ^1(5,  s')C2(^,  t^)  for  some  spatial  covariance  C\  and  temporal 
covariance  C2.  This  class  of  spatial-temporal  processes  offers  enormous  computational  benefits, 
because  the  covariance  matrix  can  be  expressed  as  the  Kronecker  product  of  two  smaller  matrices 
that  arise  separately  from  the  temporal  and  purely  spatial  processes,  and  then  its  determinant 
and  inverse  are  easily  determinable.  Thus,  separability  is  a  desirable  property  for  spatial-temporal 
processes,  but  it  is  usually  a  rather  unrealistic  assumption  for  large  spatial-temporal  domains,  and 
there  is  little  written  about  nonseparable  models.  What  is  novel  about  this  paper  is  the  introduction 
of  new  classes  of  space-time  nonstationary  and  nonseparable  families  of  covariance  models.  New 
fitting  algorithms  are  developed  to  estimate  the  space-time  covariance. 

The  true  wind  field  (speed  and  direction)  is  a  quantity  that  is  not  known.  It  is  a  function  of  a 
spatial-temporal  trend  term  and  a  zero  mean  process,  which  has  a  nonseparable  and  noii-stationary 
covariance  that  might  change  with  location.  In  this  research,  the  true  wind  is  determined  by 
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conditioning  on  the  observed  wind  field  at  specified  locations  and  on  the  wind  fields  derived  from 
MM5,  a  widely  used  mesoscale  meteorological  model.  The  observed  wind  field  at  any  particular 
location  is  treated  as  a  function  of  the  true  (but  unknown)  wind  and  measurement  error.  The  wind 
field  from  the  numerical  model  is  treated  as  function  of  a  linear  and  multiplicative  bias  and  a  term 
which  represents  random  deviations  with  respect  to  the  true  wind  process. 

A  Bayesian  approach  is  taken  to  providing  information  about  the  true  wind  field.  To  obtain 
the  posterior  predictive  distribution  of  the  true  wind  field  given  the  observed  and  MM5  forecast 
wind  fields,  prior  distributions  must  be  identified  for  the  set  of  parameters  related  to  both  the 
observed  wind  and  the  model  forecast  wind  field  as  well  as  the  spatial/temporal  nature  of  the 
data  sets.  In  addition,  the  likelihood  must  be  specified.  In  order  to  to  arrive  at  the  posterior 
predictive  distribution,  a  multi-stage  Gibbs  sampling  approach  is  used  to  simulate  values  for  the 
set  of  parameters  from  the  posterior  distribution  of  the  parameter  vector  given  the  observed  and 
the  MM5  predicted  wind  fields. 

This  approach  provides  a  representation  of  the  wind  field  that  is  potentially  more  accurate  than 
that  from  operational  data  assimilation  systems.  In  a  sense,  this  procedure  can  be  considered  to  be 
a  post-model  run  hierarchical  Bayesian  nudging  procedure.  This  Bayesian  modified  model  output 
field  would  provide  a  more  accurate  picture  of  the  atmosphere  for  subsequent  model  runs.  A  similar 
approach  was  proposed  by  Best  et  al.  (2000),  who  relate  different  spatially  varying  quantities  to 
an  underlying  unobservable  random  field  for  a  regression  analysis  of  health  and  exposure  data.  In 
our  method  we  also  have  an  underlying  unobservable  process,  but  the  statistical  model  we  propose 
is  novel.  We  present  a  new  way  to  relate  meteorological  variables  to  an  underlying  process:  in  this 
case  the  true  wind  values.  We  also  propose  new  models  for  nonseparable  processes.  Wikle  et  al. 
(2001)  presented  an  approach  to  combining  data  from  different  sources  to  improve  the  prediction 
of  wind  fields.  Wikle  et  al.’s  approach  is  a  conditional  one  in  which  all  the  spatial  quantities  are 
defined  through  a  series  of  statistical  conditional  models.  In  their  approach  the  output  of  the 
numerical  models  is  treated  as  a  prior  process.  We  present  here  a  simultaneous  representation  of 
the  data  and  the  output  of  numerical  models  in  terms  of  the  underlying  truth.  Our  method  is  very 
different  from  Wikle  et  al.’s,  since  we  do  not  treat  the  output  of  the  numerical  models  as  a  prior 
process,  but  as  another  source  of  data.  Therefore,  we  write  the  output  of  the  models  in  terms  of 
the  underlying  truth,  taking  into  account  the  potential  bias  of  the  numerical  models. 

This  paper  is  organized  as  follows.  In  Section  2  we  present  new  classes  of  nonseparable  and  nonsta¬ 
tionary  models  for  space-time  processes.  In  Section  3  we  model  the  spatial-temporal  trend  using 
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a  space-time  dynamic  model  with  varying  coefficients.  In  Section  4  we  describe  the  algorithm  for 
wind  field  mapping.  In  Section  5  we  apply  the  methods  described  in  this  paper  to  wind  field  data. 


2  Modeling  the  spatial-temporal  dependence  structure 

2.1  Spectral  representation 

A  random  field  Z  in  is  called  weakly  stationary  (or  stationary),  if  it  has  finite  second  moments, 
its  mean  function  is  constant  and  it  possesses  an  autocovariance  function  C,  such  that  C(x  —  y)  = 
cov{Z(x),  Z{y)}.  If  C{h)  =  Co{\h\)  for  some  function  Cq,  then  the  process  is  called  isotropic. 

The  spectral  representation  of  a  random  process  Z(x)  is  always  interpreted  as  its  representation  in 
the  form  of  the  superposition  of  sine  and  cosine  waves  of  different  frequencies  cj.  If  Z  is  a  stationary 
random  field  with  autocovariance  C,  then  we  can  represent  the  process  in  the  form  of  the  following 
Fourier-Stieltjes  integral  (see  Yaglom  (1987)  for  example): 

Z(x)  =  [  exp{i:K^(v)dY{(jj)  (1) 

7r2 

where  the  Y  are  random  functions  that  have  uncorrelated  increments  with  complex  symmetry 
except  for  the  constraint,  dY{Lj)  =  —dY^{—Lj),  needed  to  ensure  Z(x)  is  real- valued.  Y^  denotes 
the  conjugate  of  Y.  Using  the  spectral  representation  of  Z  and  proceeding  formally, 

C(x)  =  [  exp(ix^u;)F(du;),  (2) 

Jr^ 

where  the  function  F  is  a  positive  finite  measure  and  is  called  the  spectral  measure  or  spectrum 
for  Z.  The  spectral  measure  F  is  the  mean  square  value  of  the  process  K, 

E{|y(a;)|2}  =  F(a;). 


If  F  has  a  density  with  respect  to  Lebesgue  measure,  it  is  the  spectral  density,  /,  which  is  the 
Fourier  transform  of  the  autocovariance  function: 


=  7^^  /  exp(-ix^a;)C(x)dx. 

(27r)  J^2 


Each  random  process  (not  necessarily  a  stationary  process)  having  a  covariance  function  of  the  form 
(2),  where  F  is  a  function  of  bounded  variation,  is  harmonizable.  Such  a  process  is  representable  as 
the  Fourier-Stieltjes  integral  (1),  where  Z  is  a  random  function  whose  spectral  function  coincides 
with  F. 
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2.2  A  general  class  of  space-time  covariance  models 

A  class  of  nonseparable  spatial-temporal  models  have  been  proposed  by  Cressie  and  Huang  ( JASA, 
99),  Gneiting  (JASA,  02),  and  by  Stein  (2003).  Here,  we  define  a  generalized  class  of  nonstationary 
and  nonseparable  spatial- temporal  covariance  models.  For  this  class,  the  spectral  representation 
itself  and  the  corresponding  spectral  distribution  function  (or  spectral  density)  can  change  slowly 
in  space  and  time.  Let  Z  be  a  general  space-time  process,  we  use  the  following  representation 

Z(x,  t)=  /  exp(ix^cj -h  itT)dYxt(u;,T),  (3) 

We  are  going  to  assume  throughout  this  Section  that  the  mean  of  the  process  is  zero.  We  will  define 
a  model  for  the  mean  in  Section  3.  We  capture  the  potential  lack  of  stationarity  by  allowing  the 
spectral  process  V  associated  with  Z  to  change  in  space  and  time.  Other  nonstationary  approaches 
for  spatial  prediction  have  been  proposed  by  Sampson  and  Guttorp  (1992),  Smith  (1996),  Haas 
(1995),  Higdon  et  al.  (1999),  Wikle  et  al.  (2001),  Fuentes  (2001)  and  (2002),  Fuentes  and  Smith 
(2001),  and  Nychka  et  al.  (2003)  among  others. 

The  general  spectral  representation  proposed  here  (3)  provides  an  ideal  framework  to  model  complex 
space-time  dependent  structures.  Next,  we  discuss  two  flexible  space-time  covariance  models  that 
correspond  to  processes  with  this  type  of  spectral  representation. 


2.2.1  Mixture  of  local  spectrums  (Spectral  model) 

A  particular  case  of  the  general  representation  in  (3)  is  when  the  lack  of  stationarity  is  explained 
by  allowing  the  amplitude  of  the  spectral  process  Y  to  be  space-time  dependent.  This  means 

Z{x.,t)-  I  cxp{ix^u;  +  itT)<p^^t{<^,T)dYo{u},T)  (4) 

where  (/>x,t(^j 'T')  is  an  amplitude  function  (space-time  dependent),  and  Yq  is  a  Wiener  process 
(space-time  invariant),  which  satisfies  the  relation 

E  [Vb(u;,  r)y[{'(u?',  r')]  =  5(u?  —  u)^)6{t  —  T^)(kj^dr' 

where  5  is  the  delta  Dirichlet  function.  This  is  a  space-time  version  of  the  evolutionary  spectrum 
presented  by  Priestley  (1965).  We  assume  the  functions  </>x,t(t<^> 'J^)  satisfy  the  condition 

/  |</>x,t(‘*',T)pdwdr  <  oo  (5) 

JR3 

for  all  X  and  t.  The  functions  't)  must  also  satisfy  0x,t(^?'3^)  =  ensure  Z(x, t) 

is  real- valued. 
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Then,  it  is  easy  to  see  that  the  covariance  function,  C,  of  the  process  Z(x,  t)  is  given  by  the  formula 


cov{Z(xi,ti),Z(x2,<2)}  =  C'(xi,fi;X2,<2) 

exp{i(xi  -  X2)'^w)}exp{i(fi  -  f2)'^r}<;!!>x,,t,  (w, 


[■ 

7R3 


In  particular 

var{Z(x,f)}  =  C(x,<;  ,x,t)  =  [  |0x,t(w,  r)|^da>(iT 


(6) 

(7) 


SO  that  condition  (5)  is  necessary  for  the  variance  of  Z(x,i)  to  be  finite  at  all  x  and  t,  i.e.  for  the 
existence  of  a  covariance  function  C(xi ,  X2,  ^2)-  The  representation  in  (4)  may  be  interpreted  as  a 

representation  of  the  process  Z  in  the  form  of  a  superposition  of  sinusoidal  oscillations  with  different 
frequencies  ip  =  (u;,  r)  and  random  amplitudes  (j>x,t{^y  T)dY{u^  r)  varying  over  space,  i.c.  Z(x,  t)  is 
a  superposition  of  amplitude  modulated  random  oscillations.  According  to  this  interpretation,  (7) 
describes  the  distribution  of  the  “total  power”  of  the  process  Z(x,i)  at  location  x  and  time  t  over 
the  frequencies  *0,  hence  the  contribution  from  the  frequency  0  is  Therefore 

the  function  defined  by  the  relation 


(8) 


will  be  called  the  spatial  spectral  distribution  function  of  the  process  Z,  and  /x,t(^7  t)  ~ 
is  the  spatial  spectral  density  of  Z. 

There  exist  different  representations  of  the  form  (3)  for  a  spatial-temporal  process  Z,  each  represen¬ 
tation  is  based  on  a  different  family  of  0s(0)  functions,  where  we  write  s  =  (x,  t)  and  0  =  (a;,  r)  to 
simplify  the  notation.  This  problem  is  similar  to  the  selection  of  a  basis  for  a  vector  space.  Apart 
from  that,  it  would  not  be  physically  meaningful  to  interpret  0  as  the  frequency  in  all  cases.  In 
the  physical  theory  of  oscillations  the  function  As(0)  =  0s(0)  cxp(2s^0)  is  said  to  describe  the 
amplitude  modulated  oscillation  of  frequency  0  only  if  the  “amplitude”  is  a  slowly  varying 

compared  to  exp{fs^0}  function,  i.c.  if  the  Fourier  transform  of  0s (0)  as  a  function  of  s  includes 
mainly  frequencies  much  lower  than  0.  It  is  even  often  assumed  that  this  transform  must  be  eoii- 
centrated  in  a  neighborhood  of  zero  frequeney.  We  restriet  the  permissible  variability  of  the  function 
0s (0)  of  s  by  considering  only  functions  08(0)  that  admit  a  generalized  Fourier  representation 

08(0)  =  /  (9) 

with  \dH^{(j)\  having  its  maximum  at  =  0  for  any  fixed  0.  This  condition  guarantees  that  the 
Fourier  transform  of  0s(0),  a  function  of  s,  includes  mainly  frequencies  much  lower  than  any  0, 
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and  it  has  been  suggested  by  Priestley  in  the  time  series  context.  Since  (j)s{'il>)  is  a  slowly  varying 
function  of  space  and  time,  it  is  clear  that  the  process  Z  may  be  regarded  as  being  “approximately 
jointly  stationary”  within  subregions  in  our  space-time  domain  D.  If,  however,  we  examine  the 
behavior  of  Z  within  two  subregions  which  are  sufficiently  far  apart,  we  could  find  that  although  Z 
is  practically  stationary  in  both  subregions,  the  spectral  distribution  function  of  the  two  “portions” 
of  Z  will,  in  general,  be  different  (i.e.,  the  spectral  distribution  of  the  power  of  Z  varies  on  space 
and  time).  Since  the  functions  (f>s{'ij>)  =  1  clearly  satisfy  the  conditions  to  be  imposed  on  the 

representation  (3)  certainly  includes  all  the  spatial- temp  oral  stationary  processes  having  a  finite 
variance.  A  particular  case  is  when  is  a  separable  function  of  space  and  time,  i.e. 

=  </'x  (10) 

When  (f)  is  of  the  form  (10)  the  spatial-temporal  process  is  separable, 
cov{Z(xi,fi),Z(x2,f2)} 

=  f  exp{i(xi  -  X2)^a;)}exp{i(fi  - 

f  r  (11) 

=  /  exp{i(xi -X2)'^a;)}</)LV(a;)(</)(cV)‘'(a;)da;  /  exp{i(fi  - 
Vr2  Vri 

where  and  are  spatial  and  temporal  eovarianee  functions. 

We  propose  a  more  general  model  for  (f)  that  has  the  separable  model  in  (10)  as  a  particular  case  . 
We  model  as  a  mixture  of  local  spectral  (amplitude)  functions, 

k 

</>s(i/>)  =  -  Si)</)s,(t/))  (12) 

i=l 

where  each  function  explains  the  spatial-temporal  structure  of  Z  in  a  neighborhood  of  Si. 

Locally  (in  a  neighborhood  of  Sj),  we  propose  the  following  general  nonseparable  parametric  model 
for  (psi ,  that  has  a  separable  model  as  a  particular  ease, 

0s, (a;,  +  6,||a;||2|r|2)-(-.+'^)/2  (13) 

The  parameter  explains  the  rate  of  decay  of  the  spatial  correlation  component.  For  the  temporal 
correlation,  the  rate  of  decay  is  explained  by  and  7*  is  a  scale  parameter.  The  parameter  >  0 
measures  the  degree  of  smoothness  of  the  process  Z  at  Sj,  the  higher  the  value  of  i^i  the  smoother 

Z  would  be.  The  Fourier  transform  of  <i>s^{u)^u)  is  a  Matern-type  covariance  function.  If  e*  =  0  we 


8 


have  a  3-d  Matcrn  type  model  with  different  spatial  and  temporal  ranges,  which  takes  into  account 
the  change  in  units  from  the  spatial  to  the  temporal  domain.  If  =  1  we  have  a  separable  model, 

=  7i(a?  +  +  |r|2)-('^'+^)/2.  (14) 

Then,  the  corresponding  (local)  covariance  is  separable  (as  in  (11)). 


2.2.2  Mixture  of  local  space-time  models  (Spatial  domain) 

Another  particular  case  of  the  general  representation  in  (3)  is  when  the  spectral  process  yx,t(^)'7") 
is  modeled  as  a  mixture  of  (orthogonal)  local  stationary  spcK^tral  processes  Yi  for  i  =  1, . . . ,  /c  that 
explain  the  space-time  dependence  structure  in  subregions  of  stationarity  5i, . . . 

k 

-Si)yi(a;,T)  (15) 

i=l 

where  each  Yi  explains  the  spatial-temporal  structure  of  Z  in  a  subregion  of  stationarity  Si.  Thus, 
we  can  write  the  process  Z  in  terms  of  orthogonal  processes  Zj,  for  z  =  1, . . . ,  /c 

k 

Z(x,t)  =  ^/("(s  -  Si)Zi(x,t)  (16) 

i=l 

where  for  i  —  1, . . . ,  /c,  Zi  is  the  space-time  stationary  process  associated  with  Yi. 

The  corresponding  covariance  function  for  a  process  Z  with  the  spectrum  given  by  (15)  is 

k 

COv{Z(xi,<i),Z(X2,<2)}  =  ^^(si  -  Si)K{s2  -  Si)Ci{xi  -  X2,  <1  -<2)  (17) 

i=l 

where  Sj  =  (xi,ti),  and  each  Ci  is  a  stationary  covariance  (corresponding  to  the  spectral  process 
Yi)  that  explains  the  space-time  dependency  in  a  subregion  of  stationarity  Si. 

We  propose  the  following  parametric  model  for  Q, 


Ci(x,  t) 


(2^‘/2||(x,/3i)||M)"-^..(2<./'1l(x,A<)IIM), 


V2| 


(18) 


where  is  a  modified  Bessel  function  and  the  covariance  vector  parameter,  0,  =  (i^i, p,, /?i), 
changes  from  subregion  to  subregion  to  explain  the  lack  of  stationarity.  The  parameter  pi  measures 
how  the  correlation  decays  with  distance;  generally  this  parameter  is  called  the  range.  The  param¬ 
eter  is  the  variance  of  the  random  field,  i.e.,  ai  =  var(Zi(x,  ^)),  where  the  covariance  parameter 
ai  is  usually  referred  to  as  the  sill.  The  parameter  i/i  measures  the  degree  of  smoothness  of  the  local 
stationary  process  Zi  (as  in  (14).  The  parameter  (3i  is  a  scale  factor  to  take  into  account  the  change 


9 


of  units  between  the  spatial  and  temporal  domains.  This  parametric  model  for  C*  corresponds  to 
a  3-d  Matern  type  covariance  with  an  extra  parameter  {j3i)  that  can  be  interpreted  as  a  conversion 
factor  between  the  units  in  the  space  and  time  domains.  A  more  general  model  for  Ci  would  be 
given  by  the  Fourier  transform  of  /g.  in  (13).  When  €{  =  0,  the  model  for  Ci  in  (18)  is  just  a 
particular  case  of  (13). 

The  representation  for  a  general  space-time  process  given  in  (4),  with  0  defined  in  (12),  also 
corresponds  to  a  mixture  of  space-time  stationary  processes 

k 

Z(x,t)  =  ^/^(S-Si)z;(x,t)  (19) 

2=1 

where  each  Z*  is  a  stationary  process  with  covariance  Q,  and  K  is  the  kernel  function  in  (12).  The 
difference  between  this  representation  and  (16)  is  that  the  local  stationary  processes  in  (19)  are 
correlated  to  each  other,  and  in  (12)  they  are  orthogonal.  The  cross-covariance  between  Z*  and  Zj 
is  given  by 

cov{Z,*(xi,ii),Zj(x2,f2)}  =  f  exp{i(xi -X2)’^w)}exp{2(ii 

(20) 

where  the  functions  /,  =  0?  and  fj  —  0^  are  the  square  of  the  amplitude  functions. 

3  Spatial-temporal  Trend 

In  the  previous  section  of  this  manuscript  we  have  introduced  some  new  models  for  the  spatial- 
temporal  covariance/spectrum.  In  this  section  we  present  spatial-temporal  models  for  the  large 
scale  structure  or  trend  using  covariates  with  dynamic  coefficients. 

We  represent  the  large  scale  structure  of  Z  using  a  space-time  dynamic  statistical  model: 

k 

^  XI  0  +  ^(s.  <).  (21) 

2 

where  {fi}i  are  k  covariates  (e.g.  sine  and  cosines  and  geographic  data)  of  interest  with  coefficients 
l3i  that  vary  in  space  and  time.  The  residual  term  s{s^t)  is  a  space-time  correlated  error  that 
explains  the  spatial-temporal  short  scale  structure. 

We  model  the  dynamic  coefficients  /Si  using  a  hierarchical  model  in  terms  of  an  overall  time  com¬ 
ponent  Q-  space-time  process  7i(s,f), 

I3iis,t)  =  7i,t+7i(s,t) 
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7i,t  = 

7i(s,t)  =  yi(s,t-  1)  +  Tj(s,t) 

7]  and  u  are  independent  white  noise  processes.  We  estimate  the  parameters  in  the  model  with 
their  posterior  distribution  (given  the  data  D),  i.e.  P{P\D). 

4  Wind  mapping  combining  measurements  and  model  output 

In  our  statistical  approach  to  wind  field  mapping,  which  combines  observed  data  and  model  out|)Ut 
(Fuentes  and  Raftery,  2001),  we  do  not  consider  the  measurements  at  the  monitoring  stations  to  be 
the  “truth”  because  of  measurement  error.  Thus,  we  assume  there  is  an  underlying  (unobserved) 
field  Z(x, f),  where  measures  the  “true”  meteorological  variable  at  location  x  and  time  t. 

At  station  x,  we  make  an  observation  Z(x,t)  corresponding  to  the  observation  at  this  station  at 
time  f,  and  we  assume  that 

Z{^,t)  =  Z{x,t)  +  (22) 

where  e(x,  i)  ~  A^(0,(t|)  represents  the  measurement  error  (nugget)  at  location  x. 

The  true  underlying  process  Z  is  a  spatial  process  with  a  nonstationary  and  nonseparable  covari¬ 
ance, 

Z(x,  t)  =  n{x,  t)  +  £(x,  t),  (23) 

where  Z(x,  f)  has  a  space-time  trend,  /i(x,t),  that  is  modeled  as  a  function  of  some  meteorological 
and  geographic  eovariates  using  model  (21). 

We  assume  that  Z(x,f)  has  zero- mean  correlated  errors  £:(x,  ^).  The  process  £:(x,i)  has  a  nonsta¬ 
tionary  and  nonseparable  covariance  with  parameter  vector  6  that  might  change  with  location. 

We  could  model  the  output  of  the  meteorological  models  as  follows: 

Z(x,  t)  =  a(x,  t)  -f  6(x,  t)  +  5(x).  (24) 

Here,  the  parameter  function  a(x,  t)  measures  the  additive  bias  of  the  weather  metereological  models 
at  location  x  and  time  t,  and  the  parameter  function  6(x,  t)  accounts  for  the  multiplicative  bias 
ill  the  metereological  models.  The  process  ^(x,t)  ~  7V(0,a'|)  explains  the  random  deviation  at 
location  x  with  respect  to  the  underlying  true  process  Z(x,t). 

Wind  field  mapping 

We  seek  to  obtain  more  reliable  wind  field  spatial  predictions  by  combining  model  output  with 
observations.  Thus,  for  the  prediction  of  wind  field  we  simulate  values  of  Z  from  the  posterior 
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predictive  distribution  of  the  true  underlying  wind  process  given  the  model  output  and  the  obser¬ 
vations: 

P{Z\Z,Z).  (25) 

This  is  a  calibrated  wind  prediction.  It  could  also  be  considered  a  data  assimilation  approach  in  the 
sense  that  we  are  using  the  data  to  improve  the  wind  forecast,  but  in  contrast  with  the  traditional 
approach  of  perturbing  the  model  inputs  and  assimilating  data  using  Kalman-Filter  methods,  we 
perturb  the  output  using  the  data  to  produce  a  calibrated  model  output. 

5  Application 


Figure  1:  The  box  shows  our  study  domain  in  the  Chesapeake  Bay  region. 

The  red  box  in  Figure  1  shows  our  study  domain  in  the  Chesapeake  Bay  region.  Forecasting  the 
wind  field  over  the  Bay  is  of  long-standing  interest  because  it  is  composed  of  many  features  that 
are  spatially  and  temporally  complex  in  nature.  The  goal  of  our  study  is  to  understand  the  spatial- 
temporal  structure  of  wind  fields  over  the  Bay  and  to  obtain  more  reliable  wind  field  predictions 
by  combining  model  output  with  observations.  The  analysis  for  wind  speed  on  July  21,  2002  is 
presented  here. 
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5.1  Description  of  the  data 

Both  observed  and  MM 5  OOZ  model  output  wind  fields  were  used  in  this  study.  The  observed  data 
wore  eolleeted  by  an  independent  suite  of  meteorologieal  stations  owned  by  Weatherflow,  Inc  (see 
Fig.  2)  and  provided  by  Jay  Titlow  of  Weatherflow,  Ine.  The  MM5  data  were  furnished  by  John 
McHenry  at  Baron’s  Advanced  Meteorological  Systems.  The  observed  wind  data  were  collected  at 
the  locations  shown  in  Figure  2.  The  anemometers  were  location  from  9  to  18  m  above  ground 
level,  depending  on  the  location.  Adjustment  of  the  wind  speed  values  to  the  standard  10  m  height 
was  accomplished  using  MonimObukhov  similarity  theory  (Arya,  2001).  This  adjustment  had  little 
affect  on  the  wind  speed  values  given  the  small  distances  involved.  The  MM5  model  output 


Location  map  of  monitoring  sites 


Longitude 

Figure  2:  The  location  map  of  monitoring  sites. 

fields  were  generated  using  a  15-km  Arakawa  C  grid  (see  Figure  3)  This  grid  spacing  was  selected 
l)ecause  it  is  similar  to  the  finest  mesh  produced  by  operational  models  run  at  the  National  Centers 
for  Environmental  Prediction  (NCEP),  and  based  on  discussions  with  Jay  Titlow  at  Weatherflow, 
Inc.  Winds  from  the  lowest  model  layer  were  used  after  adjustment  to  the  10m  level.  In  this 
analysis,  data  from  21  July  2002  have  been  examined  in  detail  for  the  full  24-hr  period.  The 
complicated  flow  patterns  over  the  region  during  this  time  are  evident  in  Figure  4.  The  arrows 
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Lcxation  map  of  model  output 


LongitLide 

Figure  3:  Grid  for  MM 5  output 

indicate  the  direction  from  which  the  winds  are  coming,  while  the  length  of  the  stem  indicates 
wind  speed  in  meters  per  second.  The  plots  were  done  at  3-hr  intervals.  A  streamline  (streamlines 
are  lines  that  are  everywhere  tangent  to  the  instantaneous  wind  vector)  analysis  (not  shown)  of 
the  flow  fields  shows  alternating  areas  of  confluence  (areas  where  the  streamlines  tend  to  come 
together)  and  diffluence  (areas  where  the  streamlines  tend  to  spread  apart).  Confluence  may  or 
may  not  be  associated  with  mass  convergence,  while  diffluence  may  or  may  not  be  associated  with 
mass  divergence  depending  on  how  the  wind  speed  changes  in  these  zones  (Petterssen,  1956). 
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Figure  4:  Wind  field  mapSy  showing  wind  direction  and  speed. 
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An  easterly  (winds  from  the  east,  southeast  and  northeast)  wind  component  tends  to  dominate 
over  the  region  for  this  time  period.  The  3AM  plot  clearly  shows  an  area  of  confluence  on  the  west 
side  of  the  Bay.  At  GAM  this  area  of  confluence  has  been  replaced  by  an  area  of  diffluenee  over 
the  center  of  the  Bay.  Diffluenee  seems  to  persist  over  the  Bay  for  the  rest  of  the  period.  There  is 
little  evidence  in  these  plots  to  suggest  that  MM5  was  capturing  the  sea  breeze  circulation,  which 
observations  show  to  be  present. 

5.2  The  nonstationarity  for  wind  fields 

As  a  first  empirical  attempt  to  dealing  with  the  nonstationarity  inherent  in  these  kinds  of  environ¬ 
mental  data,  we  divided  the  spatial  domain  into  two  broad  categories:  land  and  water.  From  an 
atmospheric  boundary  layer  standpoint,  this  partition  appears  to  be  a  reasonable  way  to  approach 
the  problem  of  nonstationarity.  Using  the  u  wind  component,  v  wind  component  and  wind  speed 
from  the  model  output  at  noon  on  July  21,  2002,  the  K-means  cluster  procedure  was  used  to  fur¬ 
ther  subdivide  these  two  regions  to  help  identify  domains  of  stationarity.  We  iterated  this  process 
until  the  AIC  criteria  suggested  that  there  was  no  significant  improvement  in  the  estimation  of  the 
covariance  parameters  for  the  wind  speed  anomalies  (wind  value  at  each  location  minus  the  mean 
over  time  at  that  location)  using  the  theoretical  eovarianee  model  presented  in  Section  5.4.  The 
optimal  number  of  clusters  suggested  by  the  AIC  criteria  was  five.  The  five  subregions  are  shown 
in  Figure  5.  This  final  regional  arrangement  of  clusters  appears  reasonable  considering  atmospheric 
and  oceanic  processes  that  are  occurring  in  the  boundary  layer  on  this  day.  It  is  reasonable  to 
assume  that  there  will  be  some  changes  in  the  configuration  of  these  regions  as  time  passes  and 
the  boundary  layer  structure  changes.  For  the  purposes  of  this  work,  we  assumed  that  the  changes 
were  negligible. 

5.3  Spatial-temporal  trend 

A  spatial-temporal  trend  is  fitted  using  the  the  space-time  dynamic  model  proposed  in  Section 
3,  which  has  two  components.  One  is  the  overall  temporal  trend  for  the  process  Z  (the  truth), 
and  it  doesn’t  change  with  location.  The  other  part  is  the  temporal  variation  that  changes  with 
location.  The  eovariates  used  here  are  sine  and  cosine  functions  with  respect  to  two  different  periods 
(1 /frequency):  24  hours  and  12  hours,  which  capture  the  diurnal  and  half-diurnal  cycles.  These 
two  components  of  the  spatial-temporal  trend  are  shown  in  the  Figure  6.  This  trend  is  fitted  using 
a  Bayesian  hierarchical  framework.  The  parameters  are  estimated  using  the  mean  of  the  posterior 
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Figure  5:  Subregions  of  stationarity. 

distribution.  Figure  6  (a)  shows  the  overall  wind  spec'd  trend  over  time.  From  Figure  6  (a)  it 
appears  that  from  a  time  standpoint  the  highest  wind  speeds  over  the  region  as  a  whole  occur 
near  sunset,  while  the  minimum  wind  speeds  on  that  day  occur  near  sunrise.  Figure  6  (b)  shows 
the  temporal  variation  for  each  subregion.  The  anomalies  for  subregions  1  and  3  are  generally 
negative  over  time,  while  for  subregions  4  and  5  they  are  generally  positive.  Subregion  2  starts  out 
positive  but  then  becomes  negative.  The  large  positive  values  in  subregion  5  indicate  that  the  wind 
speeds  in  that  subregion  are  higher  than  the  average  during  the  mid-day  hours.  The  large  mid-day 
negative  values  for  subregion  1  indicate  that  the  wind  speeds  are  lower  than  average  during  the 
mid-day  hours  in  that  subregion. 

In  the  next  stage  of  our  hierarchical  design  we  estimate  the  covariance. 

5.4  Spatial-temporal  covariance  structure 
Empirical  covariance  analyses 

We  start  with  an  empirical  analysis  of  the  covariance  structure.  We  draw  empirical  covariance 
graphs  for  each  subregion  for  the  wind  speed  anomalies  (wind  value  at  each  location  minus  the  mean 
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overall  mean  structure 


variation  within  each  subregion 


Figure  6:  Spatial-temporal  trend  for  wind  speed.  Top  graph  (a):  overall  temporal  trend.  Bottom 
graph  (b):  mean  temporal  variation  for  each  subregion. 


over  time  at  that  particular  location).  Figure  7  clearly  shows  ridges  in  the  empirical  covariances 
for  subregions  1,  2,  3  and  5.  These  type  of  ridges  appear  in  separable  covariance  structures. 
On  the  other  hand,  the  empirical  covariance  plot  for  subregion  4  appears  to  be  smoother,  there 
does  not  seem  to  be  a  ridge  effect,  suggesting  a  noii-separable  covariance  structure.  The  tests 
for  stationarity,  separability  and  isotropy  proposed  by  Fuentes  (2003)  within  each  subregion  were 
performed.  Stationarity  was  found  to  hold  within  each  subregion.  Separability  was  found  to  hold 
within  all  subregions  except  for  subregion  4.  The  results  from  the  formal  tests  agree  with  the 
empirical  covariance  analyses. 

Theoretical  covariance  model 

We  model  the  covariance  of  Z  using  model  (17)  proposed  in  Section  2.2.2.  with  5  subregions  of 
stationarity  (as  in  Figure  5).  For  subregion  4  we  use  the  nonseparable  model  (18).  For  the  separable 
spatial-temporal  covariances  in  the  other  subregions  we  use  Matern  models  for  the  spatial  covariance 
and  exponential  models  for  the  temporal  covariance.  We  fit  the  parameters  for  the  spatial-temporal 
covariance  of  Z  using  a  Bayesian  framework.  In  our  covariance  model,  we  allowed  for  geometric 
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Figure  7:  Space-time  empirical  covariance. 


anisotropy.  Subregion  1  and  subregion  5  seemed  to  be  anisotropic,  which  means  that  the  spatial 
covariance  depended  not  only  on  distance  but  also  direction.  A  linear  transformation  was  used 
to  transform  the  coordinates  (x,  ?/).  The  new  coordinates  are  (x',j/')  =  where  R  is  a 

rotation  matrix  and  T  is  a  shrinking  matrix.  R  and  T  are  defined  as  follows: 


(cos  A  sin  A 
—  sin  A  cos  A 


and  T  = 


1  0 
0  ^ 


A  describes  the  angle  of  rotation  and  r  is  used  to  stretch  the  coordinates.  These  anisotropy  pa¬ 
rameters  are  estimated  with  their  posterior  distribution  as  well  as  the  other  covariance  parameters. 
The  prior  distributions  that  we  specified  for  the  sill,  the  temporal  range  and  the  spatial  range  are 
/(7(,  2).  The  prior  distributions  for  the  anisotropy  parameters  {A  and  r),  the  smoothness  parameter 
(ly))  for  the  Matern  class  and  the  scale  parameter  (scale  factor  P)  in  the  non-separable  case,  are 
uniform  distributions;  with  support  (0,  27r)  for  A,  and  support  (0,  oo)  for  the  other  parameters. 
The  posterior  densities  for  the  sill  parameter  of  each  subregion  are  shown  in  Figure  8.  The  difference 
in  the  subregions  (see  Figure  5)  are  also  reflected  in  the  posterior  distributions  for  the  sill  parame¬ 
ter.  The  largest  mean  of  the  posterior  distributions  was  found  for  subregion  4,  which  encompasses 
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Figure  8:  Posterior  distribution  for  the  sill  parameter. 

a  large  variety  of  land  and  water  surfaces.  These  large  variations  in  the  surface  roughness  charac¬ 
teristics  would  contribute  to  large  variations  in  wind  speed  across  the  region.  In  addition,  a  major 
contribution  to  the  wind  speed  variations  can  be  attributed  to  stability  differences  which  exist 
between  land  and  water  surfaces  due  in  large  part  to  the  differential  heating  experienced  by  these 
surfaces.  Subregion  4  is  the  least  homogeneous  of  the  five  subregions.  The  second  most  spatially 
diverse  subregion  is  subregion  1.  Its  sill  value  reflects  this  diversity.  In  addition,  the  significant 
differences  in  the  sill  parameter  is  evidence  for  the  nonstationarity  over  the  spatial  domain. 

The  posterior  distribution  for  the  spatial  range  shown  in  Figure  9  indicates  that  subregion  5  has 
the  highest  posterior  mean.  Again,  this  result  is  consistent  with  the  nature  of  the  subregion.  One 
would  expect  strong  spatial  continuity  in  this  subregion.  The  posterior  mean  is  lower  for  the  other 
subregions,  where  the  spatial  continuity  is  weak. 

The  spatial-temporal  covariance  model  (18)  is  used  for  subregion  4,  which  is  non-separable.  Figure 
10  shows  the  posterior  density  of  the  scale  parameter.  The  scale  parameter  takes  into  account  the 
change  of  units  between  the  spatial  and  temporal  domains. 

The  spatial  smoothness  parameter  (Figure  11)  does  not  change  much  from  subregion  to  subregion. 
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Figure  9:  Posterior  distribution  for  the  spatial  range. 

the  mean  of  the  posterior  distribution  for  this  parameter  is  always  between  .5  and  1.5. 
The  posterior  means  of  the  covariance  parameters  are  listed  below: 


Process 

sill 

spatial  range 

temporal  range 

smoothness 

A 

R 

scale 

subregion  1 

0.1615 

14.44 

2.02 

0.70 

2.78 

1.11 

subregion  2 

0.1275 

17.62 

1.64 

0.88 

subregion  3 

0.1200 

16.24 

1.92 

0.99 

subregion  4 

0.3073 

16.61 

1.42 

13.24 

subregion  5 

0.0944 

25.64 

1.73 

0.58 

1.18 

1.17 

The  estimated  Bayesian  covariance  plots  are  shown  in  Figure  12. 
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Figure  10:  Posterior  distribution  for  the  scale  parameter. 

5.5  Wind  field  mapping 

The  original  MM5  output  wind  speed  map  is  shown  in  Figure  13.  The  colors  in  the  image  represent 
wind  speeds  from  MM5,  and  the  value  on  top  of  the  image  is  the  observed  wind  speed.  Figure  13 
indicates  that  there  is  a  substantial  difference  between  the  MM5  output  and  the  observed  data. 
For  the  prediction  of  wind  fields  we  simulate  values  of  the  wind  speed  from  the  posterior  predictive 
distribution  of  the  true  underlying  wind  process  given  the  model  output  and  the  observations. 
Figure  14  shows  an  improved  wind  map  obtained  by  combining  model  output  and  observed  wind 
data. 

In  Figure  14,  the  color  of  the  background  is  the  mean  of  posterior  predictive  distribution.  The 
improved  map  agrees  better  with  the  observed  data.  In  order  to  quantify  this  improvement  we 
calculated  the  MSE  =  where  roi  is  the  observed  wind  speed  at  location  fi 

is  the  predicted  wind  speed  at  location  i  (without  using  rot)  arid  n  is  the  number  of  observations. 
The  MSE  for  the  MM5  forecast  is  10.32  and  with  our  approach  we  reduced  it  to  2.45. 
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Figure  11:  Posterior  for  sinoothness  parameter. 


6  Discussion 

In  this  paper  we  introduce  new  statistical  models  for  spatial  prediction  of  wind  fields  using  dif¬ 
ferent  sources  of  data.  We  combine  spatial  data  by  relating  the  spatially  varying  variables  to  an 
underlying  unobservable  true  wind  field  process  and  we  predict  this  latent  process.  This  is  different 
from  the  traditional  geostatistical  approaches  of  cokriging  or  kriging  with  external  drift  (KED) 
for  spatial  interpolation  with  auxiliary  covariates,  as  used  by  Phillips  et  al  (1997),  and  Davis  et 
al.  (2000).  In  cokriging  or  KED  we  write  the  data  in  a  regression  model  as  a  function  of  some 
covariates.  Therefore,  if  the  observed  data  and  the  output  of  numerical  models  exhibit  different 
spatial  resolution  or  are  observed  at  different  locations,  we  could  not  write  directly  the  observed 
data  as  a  function  of  the  output  of  numerical  models  or  vice  versa.  In  this  paper  we  overcome  that 
problem  by  writing  each  source  of  information  in  terms  of  the  truth,  without  assuming  we  observe 
it.  The  general  statistical  models  introduced  in  this  paper  for  spatial  temporal  processes  allow  for 
lac'k  of  stationarity,  isotropy  and  they  do  not  assume  separability. 

From  a  meteorological  viewpoint,  this  technique  could  be  used  to  improve  the  assimilation  of  the 
surface  wind  field  observations  into  the  meteorological  model  initial  field.  In  addition,  it  could 
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Figure  12:  Space-time  covariance  with  parameters  estimated  with  the  mean  of  their  posterior  dis¬ 
tributions. 

serve  as  a  post-processing  algorithm  which  would  be  run  on  meteorological  model  output  fields. 
The  meteorological  eoinmunity  has  been  working  on  the  data  assimilation  problem  for  many  years. 
This  research  has  produced  some  very  sophisticated  methodologies.  For  a  current  review  of  this 
literature  see  Kalnay  (2003).  The  work  by  Daley  (1991)  and  Talagrand  (1997)  should  also  be 
consulted. 

Kalnay  (2003)  points  out  that  early  in  the  history  of  the  the  development  of  numerical  weather 
prediction  techniques  it  was  found  that  a  complete  first  guess  estimate  of  the  state  of  the  atmosphere 
was  required  in  addition  to  the  traditional  set  of  observations.  The  first  guess  field  should  be  the 
meteorologist’s  best  estimate  of  the  state  of  the  atmosphere  before  incorporating  the  observations. 
In  a  statistieal  sense,  this  is  similar  to  the  speeifieation  of  a  prior  in  Bayesian  analysis.  Currently 
short-range  forecasts  serve  as  the  first  guess  field  in  what  has  eoine  to  be  called  the  ’’analysis 
cycle.”  According  to  Kalnay,  the  analysis  cycle  is  an  intermittent  data  assimilation  procedure  which 
typically  uses  a  6-hr  cycle  performed  four  times  a  day.  In  areas  where  there  are  many  observing 
sites,  the  observed  data  have  the  most  influence  on  the  model  initial  field,  while  in  areas  where 


24 


MM5  output  at  Jufy  21  12:00pm 


M> 


n 


-770 


“76  5 


“76  0 


“75,5 


-75,0 


Figure  13:  Original  MM5  output  wind  speed. 

observational  data  are  scarce  the  model  first  guess  field  is  important  since  it  can  build  on  the 
observational  data  from  data  rich  areas.  The  procedure  outlined  in  this  paper  provides  a  powerful 
technique  for  combining  the  first  guess  field  with  the  available  observations. 

The  improvement  obtained  in  the  mean  square  error  when  the  meteorological  model  output  and  the 
observed  data  are  combined  is  very  good.  However,  these  results  arc  based  on  a  very  limited  time 
period  and  have  not  been  compared  to  other  currently  available  methods.  While  the  techniques 
presented  in  this  paper  have  great  promise,  both  further  testing  on  larger  data  sets  and  comparison 
with  exisitng  techniques  will  show  their  true  worth. 

The  technique  itself  does  hold  some  promise  for  use  in  the  forecasting  mode.  As  noted  in  the 
paper,  the  subregions  were  derived  based  on  the  noon  data  for  one  day.  Even  under  weakly 
forced  synoptic  flow  regimes,  these  subregions  of  stationary  can  be  expected  to  shift  from  hour  to 
hour.  One  alternative  might  be  to  group  hours  together  based  on  the  diurnal  restructuring  of  the 
atmospheric  boundary  layer.  Using  this  technique,  the  subregimes  may  be  more  stable  from  day 
to  day.  In  addition  the  day-to-day  stability  of  the  covariance  function  should  be  examined.  Under 
weakly  forced  flow  conditions  these  functions  may  show  enough  stabiltiy  to  be  useful  in  the  forecast 
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Figure  14:  Improved  wind  speed  by  combing  data 


mode. 
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